Note: Clicking a Digital Object Identifier (DOI) link takes you to an external site maintained by the publisher. Some full-text articles may not be available free of charge during the publisher's embargo period.
Some links on this page may take you to non-federal websites, whose policies may differ from those of this site.
- Automatically Mitigating Vulnerabilities in Binary Programs via Partially Recompilable Decompilation. PRD lifts suspect binary functions to source, making them available for analysis, revision, or review, and creates a patched binary using source- and binary-level techniques. Although decompilation and recompilation do not typically succeed on an entire binary, our approach does because it is limited to a few functions, such as those identified by our binary fault localization. Free, publicly-accessible full text available May 1, 2026. (A rough sketch of patching and recompiling a single lifted function appears after this list.)
- Machine learning (ML) pervades the field of Automated Program Repair (APR). Algorithms deploy neural machine translation and large language models (LLMs) to generate software patches, among other tasks. But there are important differences between these applications of ML and earlier work, which complicates the task of ensuring that results are valid and likely to generalize. A challenge is that the most popular APR evaluation benchmarks were not designed with ML techniques in mind. This is especially true for LLMs, whose large and often poorly disclosed training datasets may include problems on which they are evaluated. This article reviews work in APR published in the field's top five venues since 2018, emphasizing emerging trends, including the dramatic rise of ML models, especially LLMs. ML-based articles are categorized along structural and functional dimensions, and a variety of issues raised by these new methods are identified. Importantly, data leakage and contamination concerns arise from the challenge of validating ML-based APR using existing benchmarks, which were designed before these techniques were popular. We discuss inconsistencies in evaluation design and performance reporting and offer pointers to solutions where they are available. Finally, we highlight promising new directions that the field is already taking. Free, publicly-accessible full text available March 22, 2026. (A crude token-overlap contamination check is sketched after this list.)
- Free, publicly-accessible full text available January 24, 2026.
- GPUs are used in many settings to accelerate large-scale scientific computation, including simulation, computational biology, and molecular dynamics. However, optimizing codes to run efficiently on GPUs requires developers to have both a detailed understanding of the application logic and significant knowledge of parallel programming and GPU architectures. This paper shows that an automated GPU program optimization tool, GEVO, can leverage evolutionary computation to find code edits that reduce the runtime of three important applications (multiple sequence alignment, agent-based simulation, and molecular dynamics codes) by 28.9%, 29%, and 17.8%, respectively. The paper presents an in-depth analysis of the discovered optimizations, revealing that (1) several of the most important optimizations involve significant epistasis, (2) the primary sources of improvement are application-specific, and (3) many of the optimizations generalize across GPU architectures. In general, the discovered optimizations are not straightforward even for a human GPU expert, showcasing the potential of automated program optimization tools both to reduce the optimization burden for human domain experts and to provide new insights for GPU experts. Free, publicly-accessible full text available December 31, 2025. (The shape of such an evolutionary mutate/measure/select loop is sketched after this list.)
- How do complex adaptive systems, such as life, emerge from simple constituent parts? In the 1990s, Walter Fontana and Leo Buss proposed a novel modeling approach to this question, based on a formal model of computation known as the λ calculus. The model demonstrated how simple rules, embedded in a combinatorially large space of possibilities, could yield complex, dynamically stable organizations reminiscent of biochemical reaction networks. Here, we revisit this classic model, called AlChemy, which has been understudied over the past 30 years. We reproduce the original results and study their robustness using the greater computing resources available today. Our analysis reveals several unanticipated features of the system, demonstrating a surprising mix of dynamical robustness and fragility. Specifically, we find that complex, stable organizations emerge more frequently than previously expected, that these organizations are robust against collapse into trivial fixed points, but that they cannot easily be combined into higher-order entities. We also study the role played by the random generators used in the model, characterizing the initial distribution of objects produced by two random expression generators and their consequences for the results. Finally, we provide a constructive proof showing how an extension of the model, based on the typed λ calculus, could simulate transitions between arbitrary states in any possible chemical reaction network, indicating a concrete connection between AlChemy and chemical reaction networks. We conclude with a discussion of possible applications of AlChemy to self-organization in modern programming languages and quantitative approaches to the origin of life. (A minimal λ-calculus reaction 'soup' in this spirit is sketched after this list.)
- The rapidly expanding use of wastewater for public health surveillance requires new strategies to protect privacy rights, while data are collected at increasingly discrete geospatial scales, i.e., city, neighborhood, campus, and building level. Data collected at high geospatial resolution can inform on labile, short-lived biomarkers, thereby making wastewater-derived data both more actionable and more likely to cause privacy concerns and stigmatization of subpopulations. Additionally, data sharing restrictions among neighboring cities and communities can complicate efforts to balance public health protections with citizens' privacy. Here, we have created an encrypted framework that facilitates the sharing of sensitive population health data among entities that lack trust for one another (e.g., between adjacent municipalities with different governance of health monitoring and data sharing). We demonstrate the utility of this approach with two real-world cases. Our results show the feasibility of sharing encrypted data between two municipalities and a laboratory while performing secure private computations for wastewater-based epidemiology (WBE) with high precision, fast speeds, and low data costs. This framework is amenable to other computations used by WBE researchers, including population-normalized mass loads, fecal indicator normalizations, and quality control measures. The Centers for Disease Control and Prevention's National Wastewater Surveillance System shows ~8% of records attributed to collection before the wastewater treatment plant, illustrating an opportunity to further expand currently limited community-level sampling and public health surveillance through security and responsible data sharing as outlined here. (A toy secret-sharing aggregation illustrating this privacy goal is sketched after this list.)
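
The PRD entry above turns on recompiling only a handful of lifted functions rather than the whole binary. The sketch below is not the PRD toolchain; it is a minimal illustration of that one step, assuming a local gcc: a hypothetical decompiled C function is patched as text and recompiled on its own into a relocatable object file that a later (omitted) step would splice back into the original binary. The function, the bug, and the patch are invented for illustration.

```python
"""Minimal sketch of function-level repair and recompilation.

This is NOT the PRD toolchain from the paper above; it only illustrates the
idea of patching and recompiling a single lifted function. The decompiled
source, the bug, and the patch are hypothetical placeholders.
"""
import subprocess
import tempfile
from pathlib import Path

# Hypothetical decompiler output for one suspect function (stand-in text).
DECOMPILED_FUNCTION = """
#include <string.h>

void copy_name(char *dst, const char *src) {
    strcpy(dst, src); /* suspected overflow */
}
"""

# Hypothetical repair: bound the copy, assuming a 64-byte destination buffer.
PATCHED_FUNCTION = DECOMPILED_FUNCTION.replace(
    "strcpy(dst, src); /* suspected overflow */",
    "strncpy(dst, src, 63); dst[63] = '\\0'; /* bounded copy */",
)


def recompile_function(source_text: str, out_dir: Path) -> Path:
    """Compile a single lifted function to a relocatable object file."""
    c_file = out_dir / "suspect_function.c"
    obj_file = out_dir / "suspect_function.o"
    c_file.write_text(source_text)
    # -c: stop after producing the object file; a later (omitted) step would
    # splice this object back into the original binary.
    subprocess.run(["gcc", "-c", "-O2", str(c_file), "-o", str(obj_file)],
                   check=True)
    return obj_file


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        obj = recompile_function(PATCHED_FUNCTION, Path(tmp))
        print(f"recompiled patched function -> {obj}")
```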
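
The APR survey above highlights data leakage between LLM training sets and evaluation benchmarks. A crude screen that practitioners sometimes use (not a method proposed by the article) is token-shingle overlap between a benchmark fix and retrievable training text; the sketch below computes Jaccard similarity over 5-token shingles on two invented Java-like snippets.

```python
"""Crude contamination screen: n-gram shingle overlap between a benchmark
patch and a candidate training-corpus snippet. Both snippets are invented;
a real screen would scan retrievable training data at scale."""
import re


def shingles(text: str, n: int = 5) -> set[tuple[str, ...]]:
    """Return the set of n-token shingles of a code snippet."""
    tokens = re.findall(r"[A-Za-z_]\w*|\S", text)
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity; 1.0 means identical shingle sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


BENCHMARK_PATCH = "if (index < 0 || index >= size) throw new IndexOutOfBoundsException();"
CORPUS_SNIPPET = "if (index < 0 || index >= size) { throw new IndexOutOfBoundsException(); }"

score = jaccard(shingles(BENCHMARK_PATCH), shingles(CORPUS_SNIPPET))
print(f"shingle overlap: {score:.2f}")  # high overlap suggests possible leakage
```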
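
GEVO's search, as summarized above, is an evolutionary loop: mutate candidates, measure runtime, keep the fastest. The sketch below shows only that loop shape on a stand-in problem; the tuned parameters (a block size and an unroll factor) and the synthetic runtime model are assumptions, whereas GEVO itself mutates compiler-level program representations and times real GPU kernels.

```python
"""Shape of an evolutionary mutate/evaluate/select loop for finding faster
configurations. The 'runtime model' is synthetic and the tuned parameters
are stand-ins, not anything from the paper above."""
import random

random.seed(0)


def simulated_runtime(block_size: int, unroll: int) -> float:
    """Synthetic stand-in for timing a kernel launch (lower is better)."""
    return abs(block_size - 256) / 64 + abs(unroll - 4) + random.uniform(0.0, 0.2)


def mutate(cfg):
    """Perturb one of the two tuned parameters within its legal range."""
    block_size, unroll = cfg
    if random.random() < 0.5:
        block_size = max(32, min(1024, block_size + random.choice([-32, 32])))
    else:
        unroll = max(1, min(16, unroll + random.choice([-1, 1])))
    return (block_size, unroll)


def evolve(pop_size=16, generations=40):
    population = [(random.randrange(1, 33) * 32, random.randrange(1, 17))
                  for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(population, key=lambda c: simulated_runtime(*c))
        survivors = scored[: pop_size // 2]  # truncation selection
        children = [mutate(random.choice(survivors))
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    return min(population, key=lambda c: simulated_runtime(*c))


best = evolve()
print(f"best config found: block_size={best[0]}, unroll={best[1]}")
```

Truncation selection keeps the sketch short; a real tool would also need crossover, validation of program correctness after each edit, and repeated timing runs to control measurement noise.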
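
In AlChemy, as originally described by Fontana and Buss (details recalled here rather than taken from the abstract above), a 'collision' applies one λ-expression to another and normalizes the result, which then rejoins a fixed-size population. The sketch below is a minimal version of that dynamic using De Bruijn indices, with a step budget and size bound to cut off divergent reductions; the population size, expression depth, and budgets are arbitrary choices, not the paper's settings.

```python
"""Minimal AlChemy-style soup: collide two random closed lambda-terms by
application, normalize with a step budget, and feed normal forms back into a
constant-size population. De Bruijn indices avoid variable capture."""
import random

random.seed(1)

# Terms: ('var', i) | ('lam', body) | ('app', fn, arg)


def shift(t, d, cutoff=0):
    """Shift free variable indices >= cutoff by d."""
    if t[0] == 'var':
        return ('var', t[1] + d) if t[1] >= cutoff else t
    if t[0] == 'lam':
        return ('lam', shift(t[1], d, cutoff + 1))
    return ('app', shift(t[1], d, cutoff), shift(t[2], d, cutoff))


def subst(t, j, s):
    """Substitute s for variable j in t."""
    if t[0] == 'var':
        return s if t[1] == j else t
    if t[0] == 'lam':
        return ('lam', subst(t[1], j + 1, shift(s, 1)))
    return ('app', subst(t[1], j, s), subst(t[2], j, s))


def step(t):
    """One leftmost-outermost beta step, or None if t is in normal form."""
    if t[0] == 'app':
        fn, arg = t[1], t[2]
        if fn[0] == 'lam':
            return shift(subst(fn[1], 0, shift(arg, 1)), -1)
        r = step(fn)
        if r is not None:
            return ('app', r, arg)
        r = step(arg)
        if r is not None:
            return ('app', fn, r)
        return None
    if t[0] == 'lam':
        r = step(t[1])
        return ('lam', r) if r is not None else None
    return None


def term_size(t):
    if t[0] == 'var':
        return 1
    if t[0] == 'lam':
        return 1 + term_size(t[1])
    return 1 + term_size(t[1]) + term_size(t[2])


def normalize(t, budget=100, max_size=200):
    """Reduce to normal form, or None if the budget or size bound is hit."""
    for _ in range(budget):
        if term_size(t) > max_size:
            return None
        r = step(t)
        if r is None:
            return t
        t = r
    return None


def random_term(depth, binders=0):
    """Random closed term: variable indices always point at an enclosing lambda."""
    if depth == 0:
        return ('var', random.randrange(binders)) if binders else ('lam', ('var', 0))
    r = random.random()
    if r < 0.35:
        return ('lam', random_term(depth - 1, binders + 1))
    if r < 0.75 or binders == 0:
        return ('app', random_term(depth - 1, binders), random_term(depth - 1, binders))
    return ('var', random.randrange(binders))


def run(pop_size=50, collisions=2000):
    soup = []
    while len(soup) < pop_size:
        t = normalize(random_term(4))
        if t is not None:
            soup.append(t)
    for _ in range(collisions):
        product = normalize(('app', random.choice(soup), random.choice(soup)))
        if product is not None:  # non-normalizing collisions are discarded
            soup[random.randrange(pop_size)] = product
    print(f"distinct expressions after {collisions} collisions: {len(set(soup))}")


if __name__ == "__main__":
    run()
```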
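
The wastewater entry above is about computing shared statistics without exposing any one municipality's raw data. The paper describes an encrypted framework; the sketch below instead uses additive secret sharing over a prime field, a related but different privacy technique, to show how two municipalities and a laboratory could learn only an aggregate viral load. The measurements and modulus are made up.

```python
"""Toy additive secret sharing: two municipalities and a lab jointly compute
an aggregate without revealing individual measurements. This is an
illustrative stand-in, not the encrypted framework from the paper."""
import secrets

PRIME = 2**61 - 1  # field modulus; inputs are integer-scaled gene copies/L


def share(value: int, n_parties: int = 3):
    """Split value into n additive shares that sum to value mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def reconstruct(shares):
    return sum(shares) % PRIME


# Each municipality secret-shares its (scaled) wastewater measurement.
measurement_a = 125_000  # hypothetical copies/L, city A
measurement_b = 98_500   # hypothetical copies/L, city B
shares_a = share(measurement_a)
shares_b = share(measurement_b)

# Each party (city A, city B, lab) holds one share of each input and adds its
# shares locally; no party ever sees the other city's raw measurement.
local_sums = [(sa + sb) % PRIME for sa, sb in zip(shares_a, shares_b)]

# Combining the local sums reveals only the aggregate.
total = reconstruct(local_sums)
print(f"aggregate load: {total} (expected {measurement_a + measurement_b})")
```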